Skip to content

feat: add MessageGroupId in queues to activate fair-queue feature#4752

Merged
joybytes merged 6 commits intomainfrom
feat/tasks-propagate-message-group-id
Mar 23, 2026
Merged

feat: add MessageGroupId in queues to activate fair-queue feature#4752
joybytes merged 6 commits intomainfrom
feat/tasks-propagate-message-group-id

Conversation

@joybytes
Copy link
Contributor

@joybytes joybytes commented Feb 25, 2026

Summary

  • Added env variable ENABLE_SQS_MESSAGE_GROUP_IDS to control on and off switch to fair-queue (it's set to ON on this PR)
  • Passed MessageGroupId (service id) on all relevant queue calls for tasks file (shatter job rows, save tasks, retries)
  • Entry point for defining grouping for upload jobs is in rest.py via process_job.apply_async there we set service_id, and all subsequent async operations on tasks.py inherit that field by calling getattr(self, "message_group_id", None)

Testing

  • Deployed to dev-d and triggered relevant flows (one-off notification, bulk letters, email and sms, reports) to make sure the field is being propagated ✅
Screenshot 2026-02-25 at 12 20 06 Screenshot 2026-02-25 at 12 21 48
  • Flows like nightly and scheduled-tasks were tested locally as it was easier to trigger that way

Ticket

Fair queue notifications-api (tasks)

@joybytes joybytes force-pushed the feat/tasks-propagate-message-group-id branch 2 times, most recently from 1062ed3 to e3e1a40 Compare February 25, 2026 10:31
@joybytes joybytes changed the title feat: propagate message group id in tasks file feat: add MessageGroupId to job/save task queueing Feb 25, 2026
@joybytes joybytes force-pushed the feat/tasks-propagate-message-group-id branch from e3e1a40 to 908e527 Compare February 25, 2026 14:41
@joybytes joybytes marked this pull request as ready for review February 25, 2026 15:41
@risicle
Copy link
Member

risicle commented Feb 25, 2026

This seems to be ignoring the entire point of adding NotifyTask.message_group_id in alphagov/notifications-utils#1318

The point is that you don't have to go digging around everywhere to find the id you want to use for MessageGroupId, you just feed it in to the first celery task you send to the queue and in other celery tasks you simply forward from self.message_group_id to MessageGroupId=....

I did all of this in https://github.com/alphagov/notifications-api/pull/4693/changes#diff-cca5917c620894fe770faa88835e9e186285d5b91a55fca992eb94d5dc87e635

@joybytes
Copy link
Contributor Author

joybytes commented Feb 25, 2026

This seems to be ignoring the entire point of adding NotifyTask.message_group_id in alphagov/notifications-utils#1318

The point is that you don't have to go digging around everywhere to find the id you want to use for MessageGroupId, you just feed it in to the first celery task you send to the queue and in other celery tasks you simply forward from self.message_group_id to MessageGroupId=....

I did all of this in https://github.com/alphagov/notifications-api/pull/4693/changes#diff-cca5917c620894fe770faa88835e9e186285d5b91a55fca992eb94d5dc87e635

@risicle the point of entry for a task needs an explicit message_group_id (in this case service_id) right?

if you're re-enqueing the task again using "self" is valid (like I did for retries)

at some point in the journey, we need an explicit grouping, otherwise where does it get it from? we need to pass on service_id

@risicle
Copy link
Member

risicle commented Feb 25, 2026

See my PR, it gets done in run_scheduled_jobs and create_job (in app/job/rest.py), because process_job itself also has a service_id context, which the shattered tasks can inherit their group id from.

@risicle
Copy link
Member

risicle commented Feb 25, 2026

We need to feed the message group id in at the point where a process becomes service_id-specific, ideally nowhere else.

@joybytes joybytes force-pushed the feat/tasks-propagate-message-group-id branch 5 times, most recently from f1af5aa to 91d6414 Compare February 25, 2026 17:47
@joybytes
Copy link
Contributor Author

We need to feed the message group id in at the point where a process becomes service_id-specific, ideally nowhere else.

@risicle done

@joybytes joybytes force-pushed the feat/tasks-propagate-message-group-id branch from bb72345 to f36f318 Compare February 25, 2026 19:58
@joybytes joybytes marked this pull request as draft February 25, 2026 19:59
@joybytes joybytes force-pushed the feat/tasks-propagate-message-group-id branch from fff72cc to b21c3a1 Compare February 26, 2026 09:18
@joybytes joybytes marked this pull request as ready for review February 26, 2026 09:32
@risicle
Copy link
Member

risicle commented Feb 27, 2026

I think we need to inject the MessageGroupId in run_scheduled_jobs and check_for_missing_rows_in_completed_jobs too - the two other places job-related tasks get initiated (see https://github.com/alphagov/notifications-api/pull/4693/changes#diff-859b3fa7698d3dc1be32ef99742c5d8620fab88d231fb3e4078752f7dc5bc05f)

FWIW this is why I'm unsure about whether we should be adding MessageGroupId bit-by-bit rather than just doing everything all at once like I did in my demo PR. We're liable to miss bits this way, and it's not like we'll be able to use ENABLE_SQS_MESSAGE_GROUP_IDS to enable the different parts separately - it's a single global flag.

@joybytes joybytes force-pushed the feat/tasks-propagate-message-group-id branch from b21c3a1 to ea70df8 Compare March 10, 2026 14:33
@joybytes joybytes marked this pull request as draft March 10, 2026 14:59
@joybytes joybytes force-pushed the feat/tasks-propagate-message-group-id branch 7 times, most recently from d7819b2 to 620a8fe Compare March 11, 2026 18:16
@joybytes joybytes force-pushed the feat/tasks-propagate-message-group-id branch 3 times, most recently from 9305f6f to 06f7dab Compare March 12, 2026 10:41
@joybytes joybytes changed the title feat: add MessageGroupId to job/save task queueing feat: add MessageGroupId in queues to activate fair-queue feature Mar 12, 2026
@joybytes joybytes force-pushed the feat/tasks-propagate-message-group-id branch 2 times, most recently from 4098856 to 1baebe9 Compare March 17, 2026 11:15
@joybytes joybytes force-pushed the feat/tasks-propagate-message-group-id branch 3 times, most recently from 2287b85 to fe69570 Compare March 18, 2026 15:30
@joybytes joybytes force-pushed the feat/tasks-propagate-message-group-id branch from fe69570 to 7ecb3ee Compare March 18, 2026 15:58
@joybytes joybytes marked this pull request as ready for review March 18, 2026 16:16
Copy link
Member

@risicle risicle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is right, but if we're going to make it default to ENABLE_SQS_MESSAGE_GROUP_IDS = 1 in the code we should update notifications-aws to set it to 0 in the terraform before we merge this.

@joybytes
Copy link
Contributor Author

I think this is right, but if we're going to make it default to ENABLE_SQS_MESSAGE_GROUP_IDS = 1 in the code we should update notifications-aws to set it to 0 in the terraform before we merge this.

@risicle https://github.com/alphagov/notifications-aws/pull/2880

@joybytes joybytes merged commit 005800a into main Mar 23, 2026
10 checks passed
@joybytes joybytes deleted the feat/tasks-propagate-message-group-id branch March 23, 2026 13:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants